Prevalence of SARS-CoV-2 antibodies and living conditions: the French national random population-based EPICOV cohort

Background: We aimed to estimate the seroprevalence of SARS-CoV-2 infection in France and to identify the populations most exposed during the first epidemic wave.
Methods: Individuals aged 15 years or over were randomly selected from the national tax register (96% coverage). Socio-economic data, migration history, and living conditions were collected via self-administered computer-assisted web interviews or computer-assisted telephone interviews. Home self-sampling was performed for a random subsample, to detect IgG antibodies against the spike protein (Euroimmun) and neutralizing antibodies with in-house assays, in dried blood spots (DBS).
Results: The questionnaire was completed by 134,391 participants from May 2nd to June 2nd, 2020, including 17,441 eligible for DBS, 12,114 of whom were tested. ELISA-S seroprevalence was 4.5% [95% CI 3.9–5.0] overall, reaching up to 10% in the two most affected areas. High-density residences, larger household size, having reported a suspected COVID-19 case in the household, working in healthcare, being of intermediate age and non-daily tobacco smoking were independently associated with seropositivity, whereas living with children or adolescents did not remain associated after adjustment for household size. Adjustment for both residential density and household size accounted for much of the higher seroprevalence in immigrants born outside Europe, which was twice that in French natives in univariate analysis.
Conclusion: The EPICOV cohort is one of the largest national representative population-based seroprevalence surveys for COVID-19. It shows the major role of contextual living conditions in the initial spread of COVID-19 in France, during which the availability of masks and virological tests was limited.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12879-021-06973-0.


Introduction
The COVID-19 pandemic has highlighted the paramount importance of public health surveys including assessments of seroprevalence for estimating the cumulative incidence of SARS-CoV-2 infection at population level. Evaluations limited to data for confirmed cases or deaths greatly underestimate disease propagation, due to the large proportion of mildly affected or asymptomatic individuals and the lack of RT-PCR screening tests at the start of the pandemic [1]. Nationwide representative population antibody studies have been conducted in few countries to assess SARS-CoV-2 circulation, and rarely on random samples from the general population [2].
France has been severely affected by COVID-19, but disease burden has been uneven across the country. Concerns about the contributions of social inequalities to spatial variations of COVID-19 exposure or severity have been raised [3], but most of the available data are based on deaths, hospitalization or reported cases [4].
EpiCov is a large French national random population-based public health study including serological testing and longitudinal follow-up, aiming at analysing both the impact of living conditions on the dynamics of the epidemic, and the impact of the epidemic on health and living conditions [5].
Here, we aimed to provide a national estimate of SARS-CoV-2 seroprevalence in France in May 2020, at the end of the first lockdown, and to identify the most exposed populations in terms of living and socio-economic conditions.

Study design
Individuals aged 15 years or older living in mainland France or three of the five French overseas territories were randomly selected from the FIDELI administrative sampling frame. FIDELI covers 96.4% of the population living in France, providing postal addresses for all individuals, and an e-mail address or telephone number for 83%.
Sampling was stratified on two criteria: administrative area (départements, equivalent to counties, in mainland France and three overseas territories), and a binary indicator of poverty defined on the basis of a threshold of 60% of the median national per capita household income. A differential sampling fraction was used to ensure overrepresentation of the less densely populated départements and people with lower incomes, for which lower response rates were expected. Individuals living in residential care homes for the elderly were excluded.

Multimodal data collection
All selected individuals were contacted by post, e-mail and text messages (SMS), with up to seven reminders. A choice between self-administered computer-assisted web interviews (CAWI) and computer-assisted telephone interviews (CATI) was offered to a random subsample of 20%. The remaining 80% were assigned to CAWI exclusively.

Home blood self-sampling and serological testing
Home capillary blood self-sampling was proposed during the web/telephone questionnaire. Dried blood spots (DBS) were collected on Whatman 903 paper kits sent to each participant agreeing to blood sampling, and were mailed to the central biobank (Robert Pellegrin Hospital, Bordeaux) to be punched with a Panthera™ machine (Perkin Elmer). Eluates were processed in the virology laboratory (Unité des Virus Emergents, Marseille) with a commercial ELISA kit (Euroimmun®, Lübeck, Germany) for detecting anti-SARS-CoV-2 antibodies (IgG) against the S1 domain of the viral spike protein (ELISA-S), according to the manufacturer's instructions. All samples with an ELISA-S test optical density ratio ≥ 0.7 were also tested with an in-house microneutralization assay to detect neutralizing anti-SARS-CoV-2 antibodies. For this assay, VeroE6 cells cultured in 96-well microplates, 100 TCID50 of the SARS-CoV-2 strain BavPat1 (courtesy of Prof. Drosten, Berlin, Germany) and serial dilutions of serum (1/20-1/160) were used, as described elsewhere [6]. Dilutions associated with the presence or absence of a cytopathic effect 4.5 days after infection were considered negative and positive, respectively. The virus neutralization titer (VNT) referred to the highest dilution of serum with a positive result. Specimens with a VNT ≥ 40 were considered positive, as the specificity at this threshold was 100% on 486 samples collected in 2017, before the emergence of SARS-CoV-2.
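The titre rule described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names and the dictionary layout (reciprocal dilution mapped to whether a cytopathic effect was seen) are hypothetical, and the sketch does not handle replicate wells or non-monotonic readings, which the in-house assay may treat differently.

```python
from typing import Dict, Optional

POSITIVE_TITRE_THRESHOLD = 40  # VNT >= 40 is reported as positive in the text


def neutralization_titre(cpe_by_dilution: Dict[int, bool]) -> Optional[int]:
    """Return the highest reciprocal dilution without cytopathic effect (CPE).

    Keys are reciprocal dilutions (20 means 1/20); values are True when a
    CPE was observed, i.e. the serum did not neutralize at that dilution.
    Returns None when no dilution neutralized the virus.
    """
    neutralizing = [d for d, cpe in cpe_by_dilution.items() if not cpe]
    return max(neutralizing) if neutralizing else None


def is_sn_positive(titre: Optional[int]) -> bool:
    """Apply the positivity threshold to a titre (None counts as negative)."""
    return titre is not None and titre >= POSITIVE_TITRE_THRESHOLD


# Hypothetical readings for one sample: CPE absent at 1/20 and 1/40 only.
example = {20: False, 40: False, 80: True, 160: True}
example_titre = neutralization_titre(example)
```

With these hypothetical readings the titre is 40, so the sample would be counted as positive for neutralizing antibodies.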
For the first round of the study in May 2020, due to the logistic complexity of such rapid implementation, a national mainland subsample and six department subsamples were randomly selected for testing, including those with the highest COVID-19 prevalences at the time.

Outcome
Seroprevalence was estimated as the proportion of the individuals tested with an ELISA-S ratio ≥ 1.1 (ELISA-S+), according to the ratio threshold supplied by the manufacturer, considered as the main criterion. We also considered the proportion of individuals with neutralizing antibodies with titres ≥ 40 (SN+). Two more sensitive estimates of seroprevalence were provided: the proportion of individuals with an ELISA-S ratio ≥ 0.7, the threshold for the microneutralization assay, and the proportion of individuals with an ELISA-S+ or SN+ result.
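The four outcome definitions can be made explicit with a small classifier. The sketch below is illustrative only: the class and field names are hypothetical, and a missing neutralization titre (sample not tested, or no neutralization observed) is assumed to count as SN-negative.

```python
from dataclasses import dataclass
from typing import Optional

ELISA_POSITIVE_RATIO = 1.1  # manufacturer's threshold (main criterion)
ELISA_LOW_RATIO = 0.7       # threshold for running the microneutralization assay
SN_POSITIVE_TITRE = 40      # neutralizing antibody titre threshold


@dataclass
class Serology:
    elisa_ratio: float
    sn_titre: Optional[int] = None  # None: not tested or no neutralization


def classify(s: Serology) -> dict:
    """Return the four seropositivity indicators used as outcomes."""
    elisa_pos = s.elisa_ratio >= ELISA_POSITIVE_RATIO
    sn_pos = s.sn_titre is not None and s.sn_titre >= SN_POSITIVE_TITRE
    return {
        "elisa_s_pos": elisa_pos,
        "sn_pos": sn_pos,
        "elisa_ratio_ge_0_7": s.elisa_ratio >= ELISA_LOW_RATIO,
        "elisa_s_pos_or_sn_pos": elisa_pos or sn_pos,
    }


# Hypothetical borderline sample: below the ELISA cut-off but neutralizing.
borderline = classify(Serology(elisa_ratio=0.9, sn_titre=80))
```

Such a borderline sample is negative on the main criterion but positive on both of the more sensitive definitions.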

Exposure
We considered contextual variables, living conditions, and individual characteristics.
As contextual variables, we considered the quintile of the COVID-19 hospitalisation rate and the sextile of the COVID-19 death rate, both cumulated until the first week of May at département level, the population density in the municipality of residence, and whether the neighbourhood was considered socially deprived, in accordance with national definitions for prioritising targeted socio-economic interventions.
Living conditions included the number and age of the people living in the household, overcrowding (defined as at least two people living in less than 18 m² per person), and whether one of the other members of the household was reported to have had fever, cough or a positive virological test since January 2020 (suspected COVID-19 case).
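The overcrowding definition translates directly into a boolean indicator. The function and parameter names below are hypothetical, and the sketch assumes that the total dwelling surface and the household size are both known.

```python
def is_overcrowded(n_people: int, surface_m2: float,
                   threshold_m2: float = 18.0) -> bool:
    """At least two people sharing less than `threshold_m2` per person."""
    if n_people < 1 or surface_m2 <= 0:
        raise ValueError("household size and surface must be positive")
    return n_people >= 2 and surface_m2 / n_people < threshold_m2


# Four people in 60 m2 have 15 m2 each, below the 18 m2 threshold.
family_of_four = is_overcrowded(n_people=4, surface_m2=60.0)
# A person living alone is never classified as overcrowded under this rule.
single_occupant = is_overcrowded(n_people=1, surface_m2=15.0)
```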
The individual characteristics recorded included gender, age, tobacco use, the decile of household income per capita, diplomas, occupation and migration history.

Ethics and regulatory issues
This study was performed in accordance with the relevant guidelines and regulations. The survey was approved by the CNIL (the French data protection authority) (ref: MLD/MFI/AR205138) and the ethics committee (Comité de Protection des Personnes Sud Méditerranée III 2020-A01191-38) in April 2020. The survey was also approved by the "Comité du Label de la Statistique Publique". All participants or their legally authorized representatives provided informed consent to participation in this study. The serological results were sent to the participants by post with information about interpreting individual test results.

Statistical analysis
SARS-CoV-2 seroprevalence was estimated with 95% confidence intervals at the national level and by geographic area, contextual variables, housing conditions, and individual characteristics. Multivariate logistic regression models included non-collinear variables identified as potential risk factors, and variables with p-values < 0.20 in univariate analysis. Univariate and multivariate analyses were conducted with ELISA-S+ as the main outcome. We restricted the analysis to individuals not living alone when investigating the effects of the number of people living in the household, the presence of a minor (under 18 years of age) and a suspected COVID-19 case among household members.

Non-response adjustment weights
Final calibrated weights were calculated to correct for non-response, as detailed elsewhere [5]. The sampling weight (the inverse of inclusion probability) was first divided by the probability of response estimated with logit models adjusted for auxiliary variables potentially linked to both the response mechanism and the main variables of interest in the EpiCov survey. The FIDELI sampling frame provided a wide range of auxiliary variables, including socio-demographic variables, income distribution classes, quality of contact information, and contextual variables obtained by georeferencing, such as population density and the proportion of people in the area aged over 65 years or living below the poverty line. Response homogeneity groups were then derived from this estimated probability (established within each département for correction for non-response to the common short questionnaire). The response probability was then estimated from the percentage of respondents in each homogeneity group, yielding first-step weights.
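A minimal sketch of this first step is given below. It is not the EpiCov implementation: the predicted probabilities are assumed to come from a logit model fitted beforehand, the groups are formed as equal-sized quantile bins (the actual grouping rule is not described here), the within-group response rate is unweighted, and all names are hypothetical.

```python
from typing import List, Optional


def first_step_weights(
    sampling_weights: List[float],
    predicted_response_prob: List[float],
    responded: List[bool],
    n_groups: int = 2,
) -> List[Optional[float]]:
    """Return non-response-adjusted weights (None for non-respondents)."""
    n = len(sampling_weights)
    # Rank individuals by predicted probability and cut into equal-sized groups.
    order = sorted(range(n), key=lambda i: predicted_response_prob[i])
    group = [0] * n
    for rank, i in enumerate(order):
        group[i] = min(rank * n_groups // n, n_groups - 1)

    # Observed (unweighted) response rate within each homogeneity group.
    rate = {}
    for g in range(n_groups):
        members = [i for i in range(n) if group[i] == g]
        if members:
            rate[g] = sum(responded[i] for i in members) / len(members)

    # Respondents carry the weight of the non-respondents in their group.
    return [
        sampling_weights[i] / rate[group[i]] if responded[i] else None
        for i in range(n)
    ]


# Hypothetical toy sample: six individuals with equal sampling weights.
example_weights = first_step_weights(
    sampling_weights=[100.0] * 6,
    predicted_response_prob=[0.2, 0.3, 0.25, 0.7, 0.8, 0.75],
    responded=[True, False, False, True, True, False],
)
```

In this toy sample the low-propensity group has a one-in-three response rate, so its single respondent receives three times the sampling weight, and the respondents' weights still sum to the total of the original sampling weights.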
In the second step, these weights were calibrated according to the margins of the population census data and population projections for several variables (10-year age categories, sex, département, diploma level, and region). Weights for the serological subsample were calibrated at national and local level for the six overrepresented areas. This calculation was designed to decrease the variance and the residual bias for variables correlated with margins.
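Calibration on margins can be illustrated with raking (iterative proportional fitting). This is an assumed stand-in: the text does not state which calibration method was used, and the variables, categories and target totals below are invented for the example.

```python
from typing import Dict, List


def rake(
    weights: List[float],
    categories: Dict[str, List[str]],
    margins: Dict[str, Dict[str, float]],
    n_iter: int = 50,
) -> List[float]:
    """Rescale weights until weighted totals match each target margin.

    Every category listed in a margin must be observed at least once,
    otherwise its weighted total is zero and the rescaling fails.
    """
    w = list(weights)
    for _ in range(n_iter):
        for var, target in margins.items():
            # Current weighted total per category of this variable.
            totals = {level: 0.0 for level in target}
            for wi, level in zip(w, categories[var]):
                totals[level] += wi
            # Rescale so that the weighted totals match the target margin.
            w = [wi * target[level] / totals[level]
                 for wi, level in zip(w, categories[var])]
    return w


# Hypothetical respondents and population margins.
calibrated = rake(
    weights=[1.0, 1.0, 1.0, 1.0],
    categories={
        "sex": ["F", "F", "M", "M"],
        "age": ["young", "old", "young", "old"],
    },
    margins={
        "sex": {"F": 60.0, "M": 40.0},
        "age": {"young": 70.0, "old": 30.0},
    },
)
```

After raking, the weighted totals by sex and by age both match the target margins, which is the property that calibration is meant to guarantee.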
The sampling design was taken into account for estimating prevalence, confidence intervals, statistical tests, and crude and adjusted odds ratios in logistic regression models.
Analyses were performed with the SAS PROC SURVEY procedures and Stata svy commands.
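To illustrate why a weighted prevalence differs from the raw proportion of positive samples, the sketch below computes a weighted proportion on invented data. It covers the point estimate only: design-based confidence intervals, as produced by survey procedures, require variance estimation that accounts for stratification and calibration and are not attempted here. The function name and the data are hypothetical.

```python
from typing import List


def weighted_prevalence(positive: List[bool], weights: List[float]) -> float:
    """Share of the total weight carried by positive individuals."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for p, w in zip(positive, weights) if p) / total


# Hypothetical data: the only positive respondent belongs to an
# over-sampled group and therefore carries a small weight.
positive = [True, False, False, False]
weights = [10.0, 30.0, 30.0, 30.0]

raw_proportion = sum(positive) / len(positive)
weighted = weighted_prevalence(positive, weights)
```

Here the raw proportion is 25%, whereas the weighted estimate is 10%, because the positive respondent represents fewer people in the target population.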

Results
We selected 371,000 people aged 15 years or over at random, 134,391 of whom completed the questionnaire from May 2nd to June 2nd, 2020. Within the random subsample of 17,123 people living in mainland France eligible for home testing, 14,995 agreed to receive the kit, 12,423 sent the DBS sample to the biobank and 12,114 samples could be analyzed (Fig. 1). The median date for blood sampling was May 21st, 2020 (IQR 18th-28th May).
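The participation rates implied by these counts can be checked with simple arithmetic. The variable names are ours; the counts are those given above, and the acceptance and return rates correspond to the rounded 88% and 83% quoted in the Strengths section.

```python
selected = 371_000
completed_questionnaire = 134_391
eligible_for_kit = 17_123
agreed_to_kit = 14_995
returned_sample = 12_423
analysed_sample = 12_114


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


questionnaire_rate = pct(completed_questionnaire, selected)  # 36.2
acceptance_rate = pct(agreed_to_kit, eligible_for_kit)       # 87.6
return_rate = pct(returned_sample, agreed_to_kit)            # 82.8
analysable_rate = pct(analysed_sample, returned_sample)      # 97.5
```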

Relationships between contextual living conditions and ELISA-S+ seropositivity (Tables 2 and 3)
In the two regions most affected by the epidemic, Ile-de-France and Grand-Est, prevalence was highest in metropolitan areas. Seroprevalence (ELISA-S+) in individuals living in densely populated municipalities was twice (6.4%) that of individuals living in zones of moderate (3.4%) or low (3.3%) population density. Socially deprived neighborhoods had rates twice those of non-deprived (8.2% versus 4.2%; p = 0.019), and overcrowded housing was associated with a doubling of seroprevalence (9.2% versus 4.3%; p < 0.001).
Seroprevalence increased strongly with the number of people living in the same dwelling, from 2.1% for people living alone, to 8.5% for households with more than four members (p = 0.017). It was higher in households of more than one person including a minor (4.0% vs. 1.2%; p < 0.001). This association disappeared after adjustment for household size (Additional file 1: Table S2).
Seroprevalence was higher for participants reporting that another member of the household had presented symptoms or had a positive PCR test (12.9% versus 4.0%; p < 0.001). This association was not affected by adjustment for household size, the presence of minors or the population density of the municipality of residence (Additional file 1: Table S2).

Relationships between individual characteristics and ELISA-S+ seropositivity (Tables 2, 3, 4)
Seroprevalence tended to be higher in women than in men (5.0% versus 3.9%; p = 0.054), and increased with age, from 3.6% in people aged 15-20 to 6.9% in those aged 30-49 years, before decreasing to 1.3% in those aged 65 or over (p < 0.001). Daily smokers had a lower likelihood of having antibodies than occasional, former or non-smokers, in whom seroprevalence was similar (2.8% vs. 5%; p = 0.031).
Seroprevalence was highest in healthcare professionals (11.4%), twice that in people with other occupations self-reported as essential (5.2%) or non-essential (5.7%) during the first national lockdown (p = 0.002). Seroprevalence was 3.0% in individuals with no professional occupation.
The individuals with the lowest level of education had the lowest seroprevalence (2.8%), below those who had completed high school (5.8%) or at least a bachelor's degree (6.2%) (p < 0.001). Concerning family income per capita, the highest seroprevalence (5% to 6%) was observed for the two lowest and the two highest deciles, with lower rates (about 3%) for central deciles (p = 0.007). Immigration status was significantly linked to seroprevalence, which was higher in first- and second-generation immigrants born outside Europe (9.4% and 6.2%, respectively) than in non-immigrants (4.1%), or first- and second-generation immigrants from Europe (4.8 and 3.6%, respectively). The relationship between seroprevalence and immigration status from outside Europe was unaffected by adjustment for individual factors, but disappeared after adjustment for both residential population density and household size: crude ORs were 2.  (Table 4).
Table 1 Prevalence of antibodies against SARS-CoV-2¹ in people living in France² at the end of the first lockdown according to cumulative hospitalisation and death rates cumulated until the first week of May at département level: the national EpiCov cohort, round 1, May 2020
Bold is used to underline % and OR
1 Home sampling for finger prick/Euroimmun ELISA-S and seroneutralization tests
2 People aged 15 or over, residing in mainland France, but not in care homes for the elderly or prisons
3 The sampling design is taken into account for the estimation of prevalence and confidence intervals, with the SAS procsurvey procedure. The percentages are weighted by sampling weight (the inverse of inclusion probability), corrected for non-response probability and calibrated on the margin of the census. The prevalences are not equal to n/N

Sensitivity analyses
Similar associations (Additional file 1: Tables S3, S4) were found when the analysis was restricted to individuals living in the two most affected regions (N = 5557).
Similar patterns were also observed for the proportion of individuals with SN titre ≥ 40 (Additional file 1: Table S4).
Bold is used to underline % and OR
1 Home sampling by finger prick/Euroimmun ELISA-S test
2 People aged 15 years or over residing in mainland France, outside residential housing for the elderly and prisons
3 The sampling design is taken into account for the estimation of prevalence, confidence intervals and statistical tests, with the SAS procsurvey procedure. The percentages are weighted by sampling weight (the inverse of inclusion probability), corrected for non-response probability and calibrated on the margin of the census. The prevalences are not equal to n/N
4 Living in a housing area with less than 18 m² per inhabitant
5 Other members of the household reported by the participant as having had symptoms or positive PCR tests since February 2020
6 First national lockdown in France: March 17th to May 11th
7 First-generation immigrants: born non-French outside France and living permanently in France (including those who subsequently acquired French nationality)
8 Second-generation immigrants: born and living in France, with at least one parent being a first-generation immigrant
9 Including medical and paramedical professionals, firefighters, pharmacists and ambulance drivers (but not including hospital cleaners, for example)
10 Home helps or housekeepers, food shop workers, delivery drivers, public transportation drivers, cab drivers, bank customer service or reception staff, petrol station employees, police officers, postal workers, cleaning staff, security guards, construction workers, truck drivers, farmers and social workers

Discussion
EpiCov, designed in March 2020, just before the first national lockdown in France, aimed to estimate the proportion of the population aged 15 years or over exposed to SARS-CoV-2, and to identify the subpopulations most exposed during the first epidemic wave. Overall seroprevalence was 4.5% [3.9-5.0], according to the cut-off recommended by the manufacturer for the Euroimmun ELISA-S test, applied to home self-sampled dried blood spots. Only two other national serological studies based on random general population samples were performed during the same period, in Spain [7] and England [8]. They reported a prevalence of seropositivity for IgG antibodies of a similar magnitude to that in France, with a similar range of geographic disparities.
EpiCov was designed to study the effects of contextual living conditions. It showed that these conditions played a major role in the initial spread of the virus, accounting for a large proportion of exposure heterogeneity. Population density at the place of residence and household size were strongly associated with ELISA-S seropositivity, independently of individual socio-demographic and occupational characteristics. The availability of masks and tests was extremely limited until May 2020. People living in the most populous areas had little opportunity for physical distancing in daily activities outside the home, particularly before lockdown.
Adjustment for both residential population density and household size accounted for much of the higher seroprevalence in immigrants born outside Europe, which was twice that of the native population, whereas seroprevalence was similar in immigrants from European countries and the native population. These findings highlight the role of the spatial segregation of populations originating from low- and middle-income countries [9,10]. Higher levels of exposure may account for part of the higher burden of COVID-19 mortality in these populations [4].
Poor socio-economic status was associated with severe COVID-19 infection [11,12]. We found a more complex pattern for relationships with seroprevalence, which was highest in the two highest and lowest deciles of family income per capita, and lowest in the individuals with the lowest level of education. This probably reflects the combination of both high exposure to COVID-19 in qualified individuals working in care professions or having multiple social activities before lockdown, and high exposure of more deprived people living in overcrowded housing in densely populated areas, with less opportunity to telework during lockdown [13]. Seroprevalence in healthcare professionals was twice that in individuals with other occupations. Healthcare workers were highly exposed to the infection during the first wave, given the shortage of surgical masks and their proximity to patients [7,8,14].
Seroprevalence did not differ significantly between women and men, after adjustment for contextual and individual characteristics, including professional activity, consistent with most studies conducted in France [15,16] and elsewhere [2,7,8]. Men seem to be more susceptible than women to developing severe forms of the infection [17], but there is no evidence of any difference in the probability of being infected [18].
Seroprevalence was highest at intermediate ages. Most population-based serological studies have reported a lower seroprevalence in the elderly [7,8,14]. Older people, at least those not living in care homes, are likely to have had fewer social interactions since being told to stay at home at the start of the outbreak. Lower rates in adolescents and young adults than in mid-age range adults have been reported in some studies [7,19] including ours, but not in others [8,20], and may be partly explained by school closures at the start of lockdown in France.
Seropositivity was strongly associated with possible cases of infection in the same household, regardless of local population density, household size and composition. This finding is consistent with the higher risk of secondary infections among people living with others [7,8,21]. After adjustment for household size, seropositivity was not associated with living with a child or an adolescent under the age of 18 years. Similar results were reported in the English national seroprevalence study [8]. This finding is also consistent with smaller studies showing that the mean household secondary attack rate from adults is not significantly different from that from children, as reported in a meta-analysis [21]. By contrast, a study conducted during the same period in population cohorts in three regions of France with similar home self-sampling reported a higher seroprevalence for individuals living in households containing a minor under 18 years of age [20]. It remains unclear whether children play a major role in intra-household transmission, which is a crucial issue, because the benefits of school closure for preventing disease spread have to be weighed against potential psychological effects and increases in educational inequalities.
We found a strong inverse association between the presence of SARS-CoV-2 antibodies and smoking at the time of the EpiCov study, as in other studies [8,20]. The possibility of biological mechanisms preventing infection in some smokers must be weighed against evidence for higher rates of severe forms of COVID-19 in infected smokers [22].
Bold is used to underline % and OR
1 Home sampling for finger prick/Euroimmun ELISA-S test
2 People aged 15 or over, living in mainland France, but not in residential care homes for the elderly or prisons
3 The sampling design is taken into account for the estimation of prevalence, crude and adjusted odds ratios, confidence intervals and tests, with the SAS procsurvey procedure. The percentages are weighted by sampling weight (the inverse of inclusion probability), corrected for non-response probability and calibrated on the margin of the census. The prevalences are not equal to n/N
4 First-generation immigrants: born non-French outside France and living permanently in France (including those who subsequently acquired French nationality)
5 Second-generation immigrants: born and living in France, with at least one parent a first-generation immigrant
6 Including medical and paramedical professionals, firefighters, pharmacists and ambulance drivers (but not including hospital cleaners, for example)
7 Home helps or housekeepers, food shop workers, delivery drivers, public transportation drivers, cab drivers, bank customer service or reception staff, petrol station employees, police officers, postal workers, cleaning staff, security guards, construction workers, truck drivers, farmers and social workers
Univariate analysis 3 Multivariate analysis 3
Bold is used to underline % and OR
1 Home sampling for finger prick/Euroimmun ELISA-S test
2 People aged 15 or over, living in mainland France, but not in residential care homes for the elderly or prisons
3 The sampling design is taken into account for the estimation of prevalence, crude and adjusted odds ratios, confidence intervals and tests, with the SAS procsurvey procedure. The percentages are weighted by sampling weight (the inverse of inclusion probability), corrected for non-response probability and calibrated on the margin of the census. The prevalences are not equal to n/N. In each bivariate model, p-values are systematically given for the immigration status and for the corresponding contextual or individual adjustment variable
4 First-generation immigrants: born non-French outside France and living permanently in France (including those who subsequently acquired French nationality)
5 Second-generation immigrants: born and living in France, with at least one parent a first-generation immigrant

Strengths
The EpiCov cohort is one of the largest national representative population-based surveys of seroprevalence in individuals aged 15 years and over, performed during an extremely challenging period, before summer 2020. It identified the populations most affected by the initial spread of the new virus, providing a basis for evaluating subsequent changes in epidemiological context and access to preventive strategies. People living below the poverty line were deliberately over-represented in the sampling, and detailed socio-economic and migration data were available. We were therefore able to perform a powerful analysis focusing on social inequalities.
Home self-sampling with DBS detection of SARS-CoV-2 antibodies limited self-selection bias, and was ideally suited to the context of the first lockdown. The acceptance rate for home sampling was 88% and the return rate was 83%, higher than the 85% and 70% assumed for the calculation of sample size.
Non-response is a well-known and crucial issue affecting the representativeness of population-based studies. In the EpiCov study, a high coverage of the sampling frame, together with mixed-mode (web/telephone) data collection, resulted in high quality in terms of representativeness [23]. Many auxiliary demographic and socio-economic variables were available from the sampling frame, which made it possible to correct a large part of the non-response bias. Moreover, the multimodal approach of EpiCov provided an exceptional opportunity to correct for endogenous self-selection bias, as detailed elsewhere [5]. This bias, which arises when the people most concerned by a topic are more likely than others to participate, occurs in studies dealing with topics with considerable media coverage.

Limitations
People living in residences for the elderly were not covered by EpiCov. We cannot exclude the possibility that we also missed non-dependent elderly individuals, due to hospitalization at the time of the survey, potentially contributing to the lower seroprevalence observed among the elderly.
The Euroimmun ELISA-S test has a sensitivity of 94.4%, according to the manufacturer's cut-off. It has been evaluated in various studies, which reported a specificity ranging from 96.2 to 100% and sensitivity ranging from 86.4 to 100% [24][25][26]. Anti-SARS-CoV-2 IgG antibody levels have been reported to decline rapidly, particularly in the elderly and in subjects with mild or asymptomatic forms [1,27,28]. ELISA-S IgG antibody levels may therefore have been under the manufacturer's cut-off for some of those previously infected. With a lower threshold (0.7), seroprevalence reached 7.1% [6.4-7.8], corresponding to 3.74 million people (3.36-4.13), close to the national projections based on surveillance data [29].
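As a consistency check on this extrapolation, the population denominator implied by each pair of figures can be back-calculated. The denominator itself is not stated in the text, so the values below are inferences from the rounded percentages and counts, not reported results; the function name is ours.

```python
def implied_population_millions(count_millions: float,
                                prevalence_pct: float) -> float:
    """Population size (in millions) implied by a count and a prevalence."""
    return count_millions / (prevalence_pct / 100)


# Point estimate and confidence bounds quoted for the 0.7 threshold.
point = implied_population_millions(3.74, 7.1)
lower = implied_population_millions(3.36, 6.4)
upper = implied_population_millions(4.13, 7.8)
```

The three back-calculated denominators all fall close to 53 million, so the quoted counts and percentages are mutually consistent up to rounding.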
EpiCov is the only national representative study to date to have reported an estimated prevalence of neutralising antibodies, at 4.1% [3.6-4.7]. Neutralising antibodies with a titre ≥ 40 were detected in only 70% of people who were ELISA-S-positive for IgG antibodies, and were also detected in 30% of participants with lower ELISA-S ratios. Several studies have reported an inverse relationship between neutralising antibody development and disease severity, but the cause-effect relationship remains unclear [30]. Neutralising antibodies may be more closely associated with protection against future infection, increasing survival and protecting against re-infection with SARS-CoV-2 strains [31].